AN APPROACH FOR ASSESSING SOFTWARE PROTOTYPES


V.E. Church, D.N. Card, W.W. Agresti, and Q.L. Jordan* 

ABSTRACT 


A procedure for evaluating a software prototype is pre- 
sented. The need to assess the prototype itself arises from 
the use of prototyping to demonstrate the feasibility of a 
design or development strategy. The assessment procedure 
can also be of use in deciding whether to evolve a prototype 
into a complete system. The procedure consists of identi- 
fying evaluation criteria, defining alternative design ap- 
proaches, and ranking the alternatives according to the 
criteria. 

INTRODUCTION 

A software prototype is a functionally incomplete model of a 
proposed system, built to demonstrate feasibility or explore 
potential requirements. Most of the interest in prototypes 
has focused on their development and their role in the software
life cycle. This article addresses prototype assessment, a topic
that is less fully developed. Prototyping
has been used most frequently to gain an understanding of
user requirements [Gomaa, Scott 81]. When prototyping is
employed for this purpose, its benefits can be compared to 
those of other activities, such as specifying, as a way of 
proceeding in the early phases of a software development 
project. Several articles discuss the advantages and dis- 
advantages of the prototyping activity (e.g., [Alavi 84] or 
[Boehm et al. 84]). In this article, we consider the eval- 
uation and assessment, not of prototyping, but of the proto- 
type itself. When the software prototype is the object 
being evaluated, two questions are of interest: 

• Is the design concept feasible? 

• Is the prototype software an adequate basis for 
further development? 


*The authors are with Computer Sciences Corporation, System 
Sciences Division, 8728 Colesville Road, Silver Spring, 
Maryland 20910. 

Prototyping requires the expenditure of organizational resources,
and the resulting prototype, although not a complete system, does
have some functionality. Organizations
are not always inclined to throw the prototype away; one 
person's prototype is another person's system. Several ar- 
ticles recommend that organizations consider evolving the 
prototype into a completed system (e.g., [Duncan 82] or 
[Blum 83]). Making that decision is significantly different 
from deciding on the merits of prototyping because it re- 
quires evaluating the prototype itself: is it worth the in- 
vestment of more resources? 

When a prototype is being used to evaluate the feasibility 
of a particular design or development strategy, the proto- 
type itself also needs to be evaluated (see [Giddings 84] 
for a discussion of uncertainty in software design). The
prototype represents one possible approach to solving a 
problem. Evaluating the prototype requires the considera- 
tion of how well alternative designs or development strate- 
gies would have addressed the problem. This article will 
explain one procedure for assessing software prototypes and 
show how it was applied in evaluating an actual prototype. 

ASSESSMENT PROCEDURE 


The procedure for assessing a prototype includes three steps: 

1. Defining the assessment criteria 

2. Identifying the design alternatives 

3. Evaluating the alternatives 

Defining the Assessment Criteria

The first step is to review the problem statement and ex- 
tract a relatively small number of high-level requirements 
to serve as criteria for assessment. We found that, based 
on the amount of effort required to treat each class prop- 
erly, the number of criteria should be on the order of 10. 
Five is probably a lower limit, and twenty is too many to 
assess in the timeframe implied by a development project. 

The assessment criteria represent the users' view of the 
problem. Each criterion should include a brief statement of 
requirement (one or two sentences), a short narrative explanation
(written in the users' terminology), and an identifying
phrase for use in tables and matrices. The intended
audience of the assessment is the user — the requirements 
formalisms that are intended to support software developers 
are out of place here. 
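
The three parts of a criterion described above can be held in a small record. The following is a minimal sketch, not part of the original procedure: the class name `Criterion`, its field names, and the helper `check_criteria_count` are hypothetical, and the example text is taken from the first FDAS criterion in the case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One assessment criterion, written for the user rather than the developer."""
    phrase: str       # identifying phrase for use in tables and matrices
    requirement: str  # brief statement of requirement (one or two sentences)
    narrative: str    # short explanation in the users' terminology


def check_criteria_count(criteria, lower=5, upper=20):
    """Return True when the number of criteria is in the suggested range.

    The bounds follow the text: five is probably a lower limit, and
    twenty is already too many.
    """
    return lower <= len(criteria) < upper


# Example built from the first FDAS criterion in the case study.
knowledge = Criterion(
    phrase="KNOWLEDGE OF APPLICATION",
    requirement="Minimize requirement for global knowledge of the "
                "application software.",
    narrative="The user should be able to focus on the particular area of "
              "concern without having to comprehend all of the housekeeping "
              "details of the programming system.",
)
print(knowledge.phrase)
```

Keeping the identifying phrase separate from the narrative lets the same record label a column of the evaluation matrix and supply the longer text for the written assessment.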

Identifying the Design Alternatives


The prototype represents only one possible solution to the 
problem. The assessment procedure requires that alterna- 
tives be identified as well, so that the prototype can be 
assessed in the context of other approaches. The second 
step, then, is to identify approaches to the problem that 
might provide alternative solutions. The alternative solu- 
tions should be based on approaches that are reasonably well 
understood. It is helpful if an alternative approach can be 
linked to specific implementations that are concrete in- 
stances of the approach. 

The alternative solutions should be as different as possible 
given the constraints of the problem domain. Examples are 
the use of 

• Data base management systems instead of dedicated 
software 

• Fourth-generation instead of procedural languages 

• Interactive instead of batch processing 

• Distributed instead of centralized processing 

The review of such disparate alternatives will certainly 
increase the confidence level of the assessment and is 
likely as well to provide useful insights to the eventual 
development process. The prototype provides a sort of 
"depth-first" perspective; the examination of alternatives 
provides the complementary "breadth-first" review. 

As with the assessment criteria, the audience for the de- 
scriptions of the alternatives is the user. The alterna- 
tives should be couched in the users' terminology and 
presented in narrative form (perhaps a page or two of description
per alternative). The number of alternatives will
probably be quite small; given the interest in diversity, 
three to six alternatives will probably exhaust the spectrum 
of possibilities. Variations on a theme (say, different 
languages with central versus distributed processing) may 
increase this to the 5-to-20 range noted above.

Evaluating the Alternatives 

Once the sets of criteria and alternatives have been estab- 
lished, the work of judging relative merit can begin. The 
approach we found most effective was offline individual re- 
view leading up to group discussions at which a consensus 
evaluation was formed. We found that it was essential to
have several different views; no single outlook or experi- 
ence base could have provided the completeness of evaluation 
that we sought. 

The assessment step consists of ranking the alternatives in 
order of the degree to which they satisfy each criterion. 

The essence of the procedure is comparative evaluation of 
alternatives within each criterion — none of the assessment 
is performed in a vacuum. Because the prototype is avail- 
able for inspection and the alternatives are well under- 
stood, consensus is easily achieved. 
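
The ranking within one criterion, including the tie rule noted with Table 1 (tied alternatives receive the average score), can be sketched as follows. This is an illustration under stated assumptions rather than the evaluators' actual worksheet: the function name `average_ranks` and the idea of recording the consensus as ordered "tiers" are hypothetical.

```python
def average_ranks(tiers):
    """Rank alternatives from 1 (best) to N (worst), averaging over ties.

    `tiers` holds one number per alternative. A lower number means the
    alternative satisfies the criterion better; equal numbers are ties.
    """
    order = sorted(range(len(tiers)), key=lambda i: tiers[i])
    ranks = [0.0] * len(tiers)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next alternative is tied with this one.
        while end + 1 < len(order) and tiers[order[end + 1]] == tiers[order[pos]]:
            end += 1
        # Positions pos..end (0-based) share the mean of ranks pos+1..end+1.
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Hypothetical consensus for one criterion: four alternatives judged
# equally good, two in the middle, three equally poor.
print(average_ranks([0, 0, 1, 2, 2, 2, 0, 0, 1]))
```

With nine alternatives grouped this way, the four tied for best each receive 2.5, the middle pair 5.5, and the last three 8, which is the pattern that appears in the FLEXIBILITY column of Table 1.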

The basis for assigning scores, of course, is the relative 
value provided to the user--how well is the underlying cri- 
terion addressed by each approach? The outcome thereby rep- 
resents an evaluation of different design concepts tailored 
very specifically to the problem domain of study. The result
of the assessment is a profile of how well the prototype
compares with less experimental approaches in the areas
of greatest concern. This information provides the basis 
for decisions on developing the full system. 

CASE STUDY--FLIGHT DYNAMICS ANALYSIS SYSTEM

The Flight Dynamics Analysis System (FDAS) is a user-oriented 
research tool, still under development, that is intended to 
support spacecraft mission analysts. FDAS will provide com- 
putational assistance in planning mission profiles, examin- 
ing various computational strategies, and performing related 
flight dynamics ground support activities. It will largely 
replace a collection of single-use tools and old, much- 
modified mission analysis programs. Its primary goal (as a 
new development) is to provide a degree of separation be- 
tween the analysts (who are generally not particularly avid 
programmers) and the rather complex support software they 
require. FDAS is to provide a new, user-friendly approach 
to performing an existing arduous and error-prone task. 

The functional requirements for FDAS were extracted from the 
existing environment. The prototyped design strategy, how- 
ever, employed an innovative "software builder" approach not 
previously attempted for this problem. The planned system 
would maintain a library of linkable components and provide 
for their modification and use [Bassett, Giblon 83]. FDAS
would provide an integrated system of functions and controls 
to simplify the programming requirements for the analysts 
who would use the system. 

The prototyping effort was commissioned by the NASA Goddard
Space Flight Center to provide a proof-of-concept demonstration
of FDAS and to investigate some possible alternatives
in the user-interface area [Sukri, Zelkowitz 83]. The prototype
assessment effort described here was part of a larger
evaluation effort that included examination of comparable 
efforts elsewhere and actual use of the prototype by space- 
craft analysts. This report focuses only on the assessment 
of the proposed FDAS design (based on the prototype) per- 
formed by members of the Software Engineering Laboratory 
[Card et al. 82].

Step 1. Defining FDAS Assessment Criteria 

From the original requirements definition materials and from 
discussions with eventual users of the system, we developed 
a set of seven criteria for assessing the concept and design 
plan for FDAS. These criteria (Table 1) reflect our under- 
standing of the problem to be solved given the constraints 
of the environment. It was recognized that the new system 
had to be useful to the existing analysis staff, had to 
function on existing computer facilities, and had to be 
maintained by existing operations personnel. Given that, we 
identified the criteria described briefly below. 

• Minimize requirement for global knowledge of the 
application software — The user should be able to focus on 
the particular area of concern (e.g., a particular orbit 
propagator or integration routine) without having to compre- 
hend all of the housekeeping details of the programming sys- 
tem (data transfer, for example, or assignment of FORTRAN 
COMMONS).

• Minimize requirement for new system-level knowl- 
edge — The existing system required user familiarity with 
editors, compilers, linkers, and the execute modes of two 
different computers and operating systems. The new system 
should attempt to reduce the current load of system aware- 
ness and to minimize the need for additional system-level 
knowledge. 

• Maximize application-level flexibility and accessi- 
bility — The existing software buried functional routines 
deep within dedicated systems or combined them inextricably 
into small once-only tools. FDAS should provide accessi- 
bility to source code through functionally organized cate- 
gories. FDAS should further support low-level modification 
and test of such routines (for example, numeric precision, 
or type of integration step size determination).

Table 1. Ranking of Alternatives Against Criteria for FDAS Case Study (a)

                              KNOWLEDGE      LEARNING NEW                  EASE OF        EASE OF
                              OF             SUPPORT                       MODIFYING      MODIFYING      LEVEL OF       EASE OF
ALTERNATIVES                  APPLICATION    SYSTEM         FLEXIBILITY    APPLICATIONS   SYSTEM         INTEGRATION    IMPLEMENTATION TOTAL

REDEVELOP EXISTING SOFTWARE
  • FORTRAN                   9              1              2.5            9              6              8              1              36.5
  • OTHER EXISTING LANGUAGE   8              6              2.5            8              5              8              4              41.5
  • SPECIAL-PURPOSE LANGUAGE  7              5              5.5            7              4              8              7              43.5

BUILD COMPREHENSIVE DATA-DRIVEN PROGRAM
  • FORTRAN                   2              3              8              2              9              2              2              28.0
  • OTHER EXISTING LANGUAGE   2              3              8              2              8              2              5              30.0
  • SPECIAL-PURPOSE LANGUAGE  2              3              8              2              7              2              8              32.0

USE SOFTWARE BUILDER (PROTOTYPE)
  • FORTRAN                   6              7              2.5            6              3              5              3              32.5
  • OTHER EXISTING LANGUAGE   5              9              2.5            5              2              5              6              34.5
  • SPECIAL-PURPOSE LANGUAGE  4              8              5.5            4              1              5              9              36.5

(a) 1 - BEST ALTERNATIVE TO SATISFY CRITERION.
    9 - WORST ALTERNATIVE TO SATISFY CRITERION.
NOTE: AVERAGE SCORE AWARDED IN CASE OF TIES.

• Minimize effort for application-level modifications--The
analytical function of FDAS requires frequent
changes to data and software. The effort required for these 
changes should be minimized. 

• Minimize effort for system-level modifications--It
is assumed that a maintenance group would be responsible for
the addition of new capabilities (a new model of the magnetosphere,
for example); the analyst-users would not perform
system-level changes. The requirement is that such major
changes, providing new system-level functionality, be performed
with minimal effort by the support group.

• Provide support for integration of data, software, 
and analysis — The existing mode of operation involves modi- 
fication of the software followed by a number of tests and 
trials using different data and conditions. FDAS should 
support the data management function of repeating tests, 
logging runs, and analyzing or comparing output. 

• Minimize implementation difficulty — Different ap- 
proaches present different levels of technical difficulty 
and probable cost or risk. These aspects should be mini- 
mized. 

Step 2. Identifying FDAS Design Alternatives 

The assessment group defined two alternative approaches (in 
addition to the prototype) to providing the functionality 
required of FDAS. Three different programming language op- 
tions were also investigated as being applicable to the 
problem domain. 

The first alternative was to redevelop existing software.
The essence of this approach is to repackage existing functionality
within an improved user interface structure. No
executive or data-processing functions would be provided
except as already available (graphs and plots, for example,
are provided in the existing system). The users would continue
to rely on the various operating systems and utilities
for support.

The second alternative was to build a comprehensive data- 
driven program, a multifunction system with behavior con- 
trolled by user input (similar to various simulation 
packages, e.g., [Forman 76]). The program would provide 
both high- and low-level opportunities to control process- 
ing. This approach would (conceivably) completely divorce 
the user from any programming language by providing a higher 
order of functionality. It also would open the possibility 
of using knowledge-based methods for extension and control
of activities. 

The third alternative (that embodied by the prototype) was 
to use a software builder approach. The system would main- 
tain an organized library of functions and procedures and 
would support linking these elements in diverse and unfore- 
seen combinations. The system would support modification of 
stored routines (including compilation and linkage) and 
their execution by way of stored command sequences. This 
approach is similar to some programmer's workbench concepts; 
it integrates the normally distinct functions of edit/ 
compile/execute/analyze tools into a harmonious whole 
[Dolotta, Mashey 82].

Three language options were also investigated by the assessment
group: FORTRAN, another existing language, or a
special-purpose language. FORTRAN is treated separately
because it is the language of most existing software and was 
used in developing the prototype. The cultural bias toward 
FORTRAN is very strong in the NASA Goddard environment, 
especially among analysts (whose backgrounds include more 
engineering, physics, and astronomy than computer science).
Any other existing language (for example, Pascal, Ada,
HAL/S) would require substantial redevelopment of existing
software; it would have to provide a significant added value 
to be seriously considered. A special-purpose language 
could be designed and developed specifically for flight dy- 
namics problems and computations. The FDAS prototype, in 
fact, included an extension to FORTRAN to support data abstractions
and modularization (e.g., as in [Isner 82]).
Such a language could be much closer to the natural algorithmic
methods that are peculiar to spacecraft flight dynamics,
but would require both development and user training.

Combining the seven assessment criteria with the nine alternative
approaches (three designs, each with three language options)
produces the evaluation matrix shown in Table 1. 
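
The shape of that matrix can be sketched by crossing the three designs with the three language options. This is a minimal illustration only: the row and column names are those of Table 1, while the function name `empty_matrix` and the dictionary layout are assumptions made for the example.

```python
from itertools import product

# Row and column names as they appear in Table 1.
DESIGNS = [
    "Redevelop existing software",
    "Build comprehensive data-driven program",
    "Use software builder (prototype)",
]
LANGUAGES = ["FORTRAN", "Other existing language", "Special-purpose language"]
CRITERIA = [
    "Knowledge of application",
    "Learning new support system",
    "Flexibility",
    "Ease of modifying applications",
    "Ease of modifying system",
    "Level of integration",
    "Ease of implementation",
]


def empty_matrix():
    """Return {(design, language): {criterion: None}}, ready to hold ranks."""
    return {
        alternative: {criterion: None for criterion in CRITERIA}
        for alternative in product(DESIGNS, LANGUAGES)
    }


matrix = empty_matrix()
print(len(matrix), "alternatives x", len(CRITERIA), "criteria")
```

Each of the 63 cells is then filled in Step 3 with the rank of that alternative on that criterion.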

Step 3. Evaluating FDAS Alternatives 

The evaluators considered each criterion in Table 1 by rank- 
ing how well each alternative addressed that criterion. The 
complete assessment involved considerably more discussion 
than is presented here [Card et al. 84]. To illustrate this
evaluation step, the discussion and rationale for two of the 
distinguishing criteria are presented here. The results of 
the assessment are shown in Table 1. 
• Knowledge of Application--It was clear that redeveloped
software might be easier to use than the existing
collection of tools, but the difficulty of modifying the 
software would still be high. All of the normal difficulty 
of preventing side-effects and validating interfaces would 
still plague the analyst-users. We assessed FORTRAN to be 
the worst offender in this area and assumed that some other 
language (Pascal was our model) would provide somewhat ti- 
dier modularization. Any new language would have as a de- 
sign goal the minimization of such problems; we scored it 
higher accordingly. 

A comprehensive data-driven program would not permit the 
user to have access to code at a level of COMMONS and in- 
terfaces and so scored very high on this criterion. The 
development language for the data-driven system would be 
transparent to the user, thus the different language options 
provided no discrimination in analyst-user terms. 

The software builder approach specifically attempts to hide 
implementation details from the user by supporting inter- 
faces, data collection, system building, and execution with 
its own constructs. This approach was thus rated better 
than the redevelopment approach but less desirable than the 
data-driven program approach. The arguments for each lan- 
guage option, as discussed for software redevelopment, are 
applicable here; the rating is shown in Table 1. 

• Ease of system-level extension — A different pattern 
appeared with this criterion. The predominant sequence of 
language options--FORTRAN last, other language (e.g., Pascal)
better, special language best--holds for each of the
design approaches, but the relative rankings of those alternatives
are different. The software builder approach was
judged most accessible to system-level changes and exten- 
sions of functionality, partly because the system itself 
provides some of the tools and means for its extension. The 
software builder would provide a structure more amenable to 
change in the directions expected for flight dynamics than 
is the case with existing software. The redevelopment ef- 
fort (as we envisioned it) would not provide such an inte- 
grated structure. The comprehensive data-driven program 
approach, because of its monolithic nature (as seen from the 
outside) would prove most difficult to extend. It should be 
noted that, in this instance, the language option did affect 
the ratings in the data-driven approach. The implementation 
language would have an effect on the ease of programming, as 
the ratings reflect. 

For convenience, we provided a total column in Table 1 to
summarize the evaluation across all of the evaluation crite- 
ria. In practice, such a total is an oversimplification of 
the analysis. The final assessment comes from assigning 
relative weights to each of the criteria and producing a 
weighted sum. This weighting enables the users' priorities 
to be reflected in the analysis results. 
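
A weighted sum of this kind can be sketched as follows. The ranks are the Table 1 entries for three of the seven criteria (flexibility, level of integration, and ease of implementation); the weights are hypothetical and are not taken from the study, and the function name `weighted_totals` is an assumption made for the example.

```python
# Table 1 ranks (1 = best, 9 = worst) for three of the seven criteria.
RANKS = {
    ("Redevelop", "FORTRAN"): {"flexibility": 2.5, "integration": 8, "implementation": 1},
    ("Redevelop", "Other"): {"flexibility": 2.5, "integration": 8, "implementation": 4},
    ("Redevelop", "Special-purpose"): {"flexibility": 5.5, "integration": 8, "implementation": 7},
    ("Data-driven", "FORTRAN"): {"flexibility": 8, "integration": 2, "implementation": 2},
    ("Data-driven", "Other"): {"flexibility": 8, "integration": 2, "implementation": 5},
    ("Data-driven", "Special-purpose"): {"flexibility": 8, "integration": 2, "implementation": 8},
    ("Software builder", "FORTRAN"): {"flexibility": 2.5, "integration": 5, "implementation": 3},
    ("Software builder", "Other"): {"flexibility": 2.5, "integration": 5, "implementation": 6},
    ("Software builder", "Special-purpose"): {"flexibility": 5.5, "integration": 5, "implementation": 9},
}

# Hypothetical user priorities; these weights are NOT from the study.
WEIGHTS = {"flexibility": 3, "integration": 2, "implementation": 1}


def weighted_totals(ranks, weights):
    """Weighted sum of ranks per alternative; a lower total is better."""
    return {
        alternative: sum(weights[c] * rank for c, rank in row.items())
        for alternative, row in ranks.items()
    }


totals = weighted_totals(RANKS, WEIGHTS)
for alternative, total in sorted(totals.items(), key=lambda item: item[1]):
    print(f"{total:5.1f}  {alternative[0]} / {alternative[1]}")
```

Setting every weight to 1 reproduces the kind of unweighted total shown in Table 1, restricted here to the three criteria used in the example.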

FDAS Assessment Summary 

On the basis of the evaluation, the assessment team found 
that the prototype had served its purpose in establishing 
the feasibility of the overall FDAS goal and of the software 
builder approach in particular. However, the comparative 
advantages were not large, and not all elements of the pro- 
totype were favorably assessed. Drawing on the discussions 
of competing strategies, the assessment team also suggested 
some changes to the design approach. 

Partly as a result of this analysis, the project underwent 
an extended operational specification process, leading to a 
substantially revised design approach along with a greater 
understanding of how the system would be used. The project 
is now well into development. 

SUMMARY OF THE ASSESSMENT PROCEDURE 

Two aspects of the prototype assessment experience should be 
highlighted: the importance of identifying alternative solutions
and the sensitivity of the analysis to the weighting
of individual criteria. Evaluating an object in isolation
is always difficult, whereas contrasting alternatives is
usually easy. Identifying alternative solutions (including
the prototyped alternative) to the problem considerably
simplifies the evaluators' task. We found that
reaching a consensus evaluation of the prototype was facili- 
tated by the context provided in Table 1. Furthermore, the 
consideration of alternatives led to recommendations for 
improving the design approach. 

The assessment procedure described here provides a much 
richer analysis than a simple good/bad evaluation. Not only 
can the prototype be evaluated in the context of alternative 
approaches, but the comparative value of different features 
can also be defined. This procedure approximates competi- 
tive development ("flyoffs") more closely than does an ac- 
ceptance test evaluation, without requiring actual parallel 
development. This assessment involved a team of evaluators 
(part-time) over a period of months. The time period coin- 
cided with the prototype effort (so no schedule delays were 
imposed), but the assessment team was completely separate
from, and in addition to, the development team. 

As noted above, the assessment procedure supports cost/ 
benefit analysis at a more detailed level than would other- 
wise be possible. This emphasizes the sensitivity of the 
procedure to choices of weighting factors for different cri- 
teria. Sensitivity analysis can be useful in identifying 
influential criteria and relatively stable alternatives. 
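
One simple form of sensitivity analysis is to vary the weight of a single criterion and watch whether the preferred alternative changes. The sketch below is an illustration under stated assumptions, not the analysis performed in the study: it uses the Table 1 ranks of the three FORTRAN alternatives on three criteria (the ranks remain those assigned among all nine alternatives), and the base weights, the sweep values, and the function names are hypothetical.

```python
# Table 1 ranks (1 = best, 9 = worst) for the three FORTRAN alternatives.
RANKS = {
    "Redevelop": {"flexibility": 2.5, "integration": 8, "implementation": 1},
    "Data-driven": {"flexibility": 8, "integration": 2, "implementation": 2},
    "Software builder": {"flexibility": 2.5, "integration": 5, "implementation": 3},
}


def best_alternative(ranks, weights):
    """Return the alternative with the lowest weighted sum of ranks."""
    totals = {
        alternative: sum(weights[c] * rank for c, rank in row.items())
        for alternative, row in ranks.items()
    }
    return min(totals, key=totals.get)


def sweep(ranks, base_weights, criterion, values):
    """Map each trial weight for `criterion` to the resulting best alternative."""
    winners = {}
    for value in values:
        weights = dict(base_weights)
        weights[criterion] = value
        winners[value] = best_alternative(ranks, weights)
    return winners


# Hypothetical base weights and sweep values.
BASE = {"flexibility": 1, "integration": 1, "implementation": 1}
for weight, winner in sweep(RANKS, BASE, "implementation", [0, 1, 2, 3]).items():
    print(f"implementation weight {weight}: {winner}")
```

A criterion whose weight flips the winner over a small range is influential; an alternative that stays on top across the whole sweep is relatively stable.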

The FDAS experience indicates that this procedure is an ef- 
fective mechanism for evaluating the feasibility of a proto- 
typed design. It provides an organizing framework for 
expressing and employing knowledge gained from previous 
software development experience. 